token usage AI News List | Blockchain.News

List of AI News about token usage

Time Details
2026-04-16
18:38
Opus 4.7 Effort Levels Explained: Adaptive Thinking Settings for Faster or Smarter AI Responses

According to @bcherny on X, Opus 4.7 replaces fixed thinking budgets with adaptive thinking and introduces adjustable effort levels that trade speed and token usage against reasoning depth and capability (source: X post by Boris Cherny, Apr 16, 2026). Per the same post, lower effort yields faster outputs with fewer tokens, while higher effort delivers more intelligent, capable responses; xhigh is recommended for most tasks and max for the hardest ones. The /effort command sets the level, and max applies only to the current session while the other levels persist, giving enterprises practical controls over latency, cost per request, and quality. For AI product teams, this enables dynamic orchestration, for example defaulting to medium effort for routine prompts and programmatically escalating to xhigh or max for complex reasoning, optimizing both infrastructure spend and user experience.
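The escalation pattern described above can be sketched as a simple router. The effort names (low, medium, xhigh, max) come from the post; the complexity heuristic and keyword list below are illustrative assumptions, not part of any Anthropic API.

```python
# Sketch: pick an effort level per request. Defaults to lower effort for
# routine prompts and escalates for complex reasoning. The scoring rule
# here is a made-up heuristic for illustration.

def complexity_score(prompt: str) -> int:
    """Crude proxy for task difficulty: length plus reasoning keywords."""
    score = len(prompt) // 200
    for kw in ("prove", "derive", "refactor", "multi-step", "architecture"):
        if kw in prompt.lower():
            score += 2
    return score

def choose_effort(prompt: str) -> str:
    """Map a complexity score onto the post's effort levels."""
    score = complexity_score(prompt)
    if score >= 6:
        return "max"    # hardest tasks only; session-scoped per the post
    if score >= 3:
        return "xhigh"  # recommended for most demanding tasks
    if score >= 1:
        return "medium"
    return "low"        # fastest, fewest tokens

print(choose_effort("Summarize this paragraph."))  # low
print(choose_effort("Derive and prove the bound, then refactor the module."))  # max
```

In practice a team would replace the heuristic with its own signal (route classifier, user tier, retry count) while keeping the same escalation shape.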

Source
2026-04-07
03:41
Meta’s Token Legends: Latest Analysis on AI Compute Leaderboards and Incentive Design in 2026

According to Ethan Mollick on X, Meta employees are competing to become “Token Legends,” ranking themselves by AI compute consumed, an echo of the classic incentive risk described in “On the Folly of Rewarding A, While Hoping for B” (Mollick shared the original paper link). As reported by The Information, internal leaderboards tie token usage to perceived productivity and influence, creating a status game in which higher compute may signal impact. The Information notes that such a metric could unintentionally reward excessive model calls over outcomes, raising cost, throughput, and model-availability risks in large-scale LLM deployments. For AI leaders, the business opportunity is to implement outcome-aligned metrics, such as experiments shipped, latency budgets met, and unit economics per successful inference, while using governance controls like per-team quotas, cost dashboards, rate limiting, and evaluation harnesses to prevent compute gaming, as highlighted by The Information’s description of token-based status and Mollick’s incentive-design framing.
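One of the governance controls listed above, a per-team token quota with a cost dashboard, can be sketched in a few lines. The team names, daily limit, and per-token price are hypothetical placeholders, not figures from either source.

```python
# Sketch: per-team daily token quota plus a cost-dashboard row.
# DAILY_LIMIT and PRICE_PER_1K are made-up illustrative values.
from collections import defaultdict

DAILY_LIMIT = 5_000_000   # tokens per team per day (hypothetical)
PRICE_PER_1K = 0.01       # blended $ per 1K tokens (hypothetical)
usage: dict[str, int] = defaultdict(int)

def record_call(team: str, tokens: int) -> bool:
    """Record a call's token usage; reject it if it would exceed the quota."""
    if usage[team] + tokens > DAILY_LIMIT:
        return False
    usage[team] += tokens
    return True

def dashboard_row(team: str) -> str:
    """One line of a cost dashboard: tokens used and dollars spent."""
    used = usage[team]
    return f"{team}: {used:,} tokens (${used / 1000 * PRICE_PER_1K:.2f})"

record_call("search-team", 1_200_000)
print(dashboard_row("search-team"))           # search-team: 1,200,000 tokens ($12.00)
print(record_call("search-team", 4_000_000))  # False: would exceed the quota
```

Measuring spend against a hard limit, rather than celebrating raw consumption, is exactly the inversion of the leaderboard incentive the item warns about.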

Source
2026-02-11
21:40
Claude Code Statusline: 7 Practical Ways to Monitor Model, Context, and Cost in 2026 (Latest Guide)

According to @bcherny, Claude Code now supports customizable status lines that appear below the composer and display the active model, working directory, remaining context, token usage, and cost, letting developers optimize their workflow and manage spend in real time. As reported by code.claude.com, users can run /statusline to auto-generate a configuration from their .bashrc or .zshrc, lowering setup friction for engineering teams adopting AI pair programming at scale.
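A statusline of the kind described above is ultimately just a function from session data to one printed line. The sketch below renders the five fields the item mentions from a JSON payload; the field names (model, dir, context_remaining, tokens, cost_usd) are illustrative assumptions, not the documented Claude Code schema.

```python
# Sketch: render a one-line status string (model | directory | remaining
# context | token usage | cost) from a JSON session payload. Field names
# are hypothetical stand-ins for whatever the tool actually provides.
import json

def render_statusline(payload: str) -> str:
    """Build a single status line from a JSON session snapshot."""
    data = json.loads(payload)
    return " | ".join([
        str(data.get("model", "?")),
        str(data.get("dir", "?")),
        f"ctx {data.get('context_remaining', '?')}",
        f"{data.get('tokens', 0):,} tok",
        f"${data.get('cost_usd', 0.0):.2f}",
    ])

print(render_statusline(
    '{"model": "opus", "dir": "~/app", "context_remaining": "62%", '
    '"tokens": 18500, "cost_usd": 0.42}'
))  # opus | ~/app | ctx 62% | 18,500 tok | $0.42
```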

Source